53 research outputs found

    An Abstract Method Linearization for Detecting Source Code Plagiarism in Object-Oriented Environment

    Although plagiarizing source code is a trivial task for most CS students, detecting such unethical behavior requires considerable effort. Several plagiarism detection systems have therefore been developed to handle the issue. This paper extends Karnalim's work, a low-level approach for detecting Java source code plagiarism, by incorporating abstract method linearization. The extension aims to enhance the accuracy of the low-level approach in detecting plagiarism in object-oriented environments. According to our evaluation, which was conducted on 23 design-pattern source code pairs, our extended low-level approach is more effective than both the state-of-the-art approach and Karnalim's approach. Compared to the state-of-the-art approach, our approach generates fewer coincidental similarities and provides more accurate results. Compared to Karnalim's approach, our approach can, to some extent, generate higher similarity when simple abstract method invocation is incorporated. Comment: The 8th International Conference on Software Engineering and Service Scienc

    The Effectiveness of Low-Level Structure-based Approach Toward Source Code Plagiarism Level Taxonomy

    The low-level approach is a novel way to detect source code plagiarism. It has been shown to be effective in a controlled environment when compared to the baseline approach (i.e., an approach that relies on source code token subsequence matching). We evaluate the effectiveness of the state of the art in low-level approaches based on Faidhi & Robinson's plagiarism level taxonomy; real plagiarism cases are employed as the dataset in this work. Our evaluation shows that the state of the art in low-level approaches is effective in handling most plagiarism attacks. Further, it outperforms its predecessor and the baseline approach at most plagiarism levels. Comment: The 6th International Conference on Information and Communication Technolog

    Dynamic Thresholding Mechanisms for IR-Based Filtering in Efficient Source Code Plagiarism Detection

    To address time inefficiency, string-matching-based source code plagiarism detection compares only potential pairs. Potentiality is defined through a fast yet order-insensitive similarity measurement (adapted from Information Retrieval), and only pairs whose similarity degrees are higher than or equal to a particular threshold are selected. Defining such a threshold is not trivial, since it should lead to a high efficiency improvement and a low effectiveness reduction (if the latter is unavoidable). This paper proposes two thresholding mechanisms, namely the range-based and pair-count-based mechanisms, that dynamically tune the threshold based on the distribution of the resulting similarity degrees. According to our evaluation, both mechanisms are more practical than manual threshold assignment since they are more proportional to efficiency improvement and effectiveness reduction. Comment: The 2018 International Conference on Advanced Computer Science and Information Systems (ICACSIS
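
    The idea of deriving a threshold from the distribution of similarity degrees can be sketched as follows. The abstract does not give the formulas, so both functions are assumptions made for illustration: the range-based variant places the threshold at a fixed ratio between the lowest and highest degree, and the pair-count-based variant uses the degree of the k-th most similar pair. All names, the `ratio` and `keep` parameters, and the sample degrees are hypothetical.

```python
def range_based_threshold(similarities, ratio=0.5):
    """Place the threshold `ratio` of the way from the lowest to the highest degree."""
    lowest, highest = min(similarities), max(similarities)
    return lowest + ratio * (highest - lowest)


def pair_count_based_threshold(similarities, keep=2):
    """Use the similarity degree of the keep-th most similar pair as the threshold."""
    ranked = sorted(similarities, reverse=True)
    keep = max(1, min(keep, len(ranked)))
    return ranked[keep - 1]


def select_pairs(pair_similarities, threshold):
    """Keep only pairs whose similarity is higher than or equal to the threshold."""
    return [pair for pair, degree in pair_similarities.items() if degree >= threshold]


# Hypothetical IR-style similarity degrees for four candidate pairs.
pair_similarities = {
    ("s1", "s2"): 0.9,
    ("s1", "s3"): 0.2,
    ("s2", "s3"): 0.6,
    ("s1", "s4"): 0.4,
}
degrees = list(pair_similarities.values())
by_range = select_pairs(pair_similarities, range_based_threshold(degrees))
by_count = select_pairs(pair_similarities, pair_count_based_threshold(degrees))
```

    Only the selected pairs would then be passed to the slower string-matching comparison.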

    TF-IDF Inspired Detection for Cross-Language Source Code Plagiarism and Collusion

    Several computing courses allow students to choose which programming language they want to use for completing a programming task. This can lead to cross-language code plagiarism and collusion, in which the copied code file is rewritten in another programming language. In response, this paper proposes a detection technique that can accurately compare code files written in various programming languages while requiring limited effort to accommodate such languages at the development stage. The only language-dependent feature used in the technique is the source code tokeniser, and no code conversion is applied. The impact of coincidental similarity is reduced by applying a TF-IDF inspired weighting, in which rare matches are prioritised. Our evaluation shows that the technique outperforms common techniques in academia for handling language conversion disguises. Further, it is comparable to those techniques when dealing with conventional disguises.
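
    A minimal sketch of how rare matches might be prioritised is given below. It is not the paper's technique: it assumes each file has already been reduced by a language-specific tokeniser to a list of generalised token names, weights token bigrams by a smoothed inverse document frequency, and compares files with a weighted overlap. The function names, the bigram choice, and the sample token streams are all hypothetical.

```python
import math
from collections import Counter


def ngrams(tokens, n=2):
    """Return the contiguous token n-grams of a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def idf_weights(files, n=2):
    """Weight each n-gram higher the fewer files it occurs in (smoothed IDF)."""
    doc_freq = Counter()
    for tokens in files.values():
        doc_freq.update(set(ngrams(tokens, n)))
    total = len(files)
    return {gram: math.log(total / df) + 1.0 for gram, df in doc_freq.items()}


def weighted_similarity(tokens_a, tokens_b, weights, n=2):
    """Weighted overlap of n-gram counts: shared weight divided by total weight."""
    count_a, count_b = Counter(ngrams(tokens_a, n)), Counter(ngrams(tokens_b, n))
    shared = sum(min(count_a[g], count_b[g]) * weights.get(g, 0.0)
                 for g in count_a.keys() & count_b.keys())
    total = sum(max(count_a[g], count_b[g]) * weights.get(g, 0.0)
                for g in count_a.keys() | count_b.keys())
    return shared / total if total else 0.0


# Hypothetical generalised token streams produced by per-language tokenisers.
files = {
    "task.java": ["for", "id", "assign", "call", "return"],
    "task.py": ["for", "id", "assign", "call", "return"],
    "other.py": ["for", "id", "if", "id", "return"],
}
weights = idf_weights(files)
suspicious = weighted_similarity(files["task.java"], files["task.py"], weights)
unrelated = weighted_similarity(files["task.java"], files["other.py"], weights)
```

    Here the bigram shared by all three files receives the lowest weight, so a match on it contributes less than a match on a bigram found in only two files.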

    Improving Scalability of Java Archive Search Engine Through Recursion Conversion and Multithreading

    Based on the fact that bytecode always exists in a Java archive, a bytecode-based Java archive search engine has been developed [1, 2]. Although this system is quite effective, it still lacks scalability, since many modules apply recursive calls and the system utilizes only one core (single thread). In this research, the Java archive search engine architecture is redesigned to improve its scalability. All recursions are converted to iterative forms, even though most of these modules are logically recursive and quite difficult to convert (e.g. Tarjan's strongly connected component algorithm). Recursion conversion can be conducted by following the respective recursive pattern. Each recursion is broken down into four parts (before and after actions for the current element and for its children) and converted to iteration with the help of a caller reference. This conversion mechanism improves scalability by avoiding stack overflow errors caused by method calls. System scalability is also improved by applying a multithreading mechanism, which successfully cuts its processing time. Shorter processing time may enable the system to handle larger data. Multithreading is applied to the major parts, which are the indexer, the vector space model (VSM) retriever, the low-rank vector space model (LRVSM) retriever, and the semantic relatedness calculator (which also involves multiprocessing). The correctness of both the recursion conversion and the multithread design is supported by the fact that all implementations yield similar results.
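
    The general shape of such a conversion can be illustrated with a depth-first traversal driven by an explicit stack. This is a generic sketch rather than the system's actual mechanism: each stack frame keeps a node and an iterator over its remaining children, so the frame underneath plays the role of the caller. The function name and the `before`/`after` callbacks are hypothetical.

```python
_DONE = object()  # sentinel marking that a node has no children left to visit


def traverse_iteratively(root, children, before, after):
    """Depth-first traversal without recursion.

    `before(node)` runs when a node is entered and `after(node)` runs once all
    of its children are finished, mirroring code placed before and after the
    recursive calls in a recursive version.
    """
    before(root)
    stack = [(root, iter(children(root)))]
    while stack:
        node, pending = stack[-1]
        child = next(pending, _DONE)
        if child is _DONE:
            after(node)   # all children handled: run the "after" action
            stack.pop()   # return control to the caller frame below
        else:
            before(child)
            stack.append((child, iter(children(child))))


# Small hypothetical tree used to show the visiting order.
tree = {"a": ["b", "c"], "b": ["d"], "c": [], "d": []}
events = []
traverse_iteratively(
    "a",
    lambda node: tree[node],
    lambda node: events.append(("enter", node)),
    lambda node: events.append(("leave", node)),
)
```

    Because the pending work lives in a list on the heap instead of the call stack, a chain that is hundreds of thousands of nodes deep can be traversed without hitting a recursion limit.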

    Extended Vector Space Model with Semantic Relatedness on Java Archive Search Engine

    Using bytecode as an information source is a novel approach that enables a Java archive search engine to be built without relying on any resource other than the Java archive itself [1]. Unfortunately, its effectiveness is not considerably high, since some relevant documents may not be retrieved because of vocabulary mismatch. In this research, a vector space model (VSM) is extended with semantic relatedness to overcome the vocabulary mismatch issue in the Java archive search engine. Aiming for the most effective retrieval model, several equations for retrieval models are also proposed and evaluated, such as summing up all related terms, substituting a non-existing term with its most related term, logarithmic normalization, context-specific relatedness, and low-rank query-related retrieved documents. In general, semantic relatedness improves recall at the cost of reduced precision. We also propose a scheme that takes advantage of relatedness without being affected by its disadvantage (VSM plus treating non-retrieved documents as low-rank retrieved documents using semantic relatedness). This scheme ensures that a relatedness score is ranked lower than a standard exact-match score. It yields 1.754% higher effectiveness than our standard VSM.
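
    The ranking scheme described above can be sketched roughly as follows. This is a simplified illustration, not the paper's equations: documents are term-weight dictionaries, the exact-match score is a plain sum of weights instead of a full VSM cosine, and a missing query term is substituted with its most related term in the document. The relatedness table, document names, and all function names are hypothetical.

```python
def exact_score(query, doc):
    """Sum the weights of query terms that literally occur in the document."""
    return sum(doc.get(term, 0.0) for term in query)


def related_score(query, doc, relatedness):
    """Like exact_score, but a missing term borrows from its most related term."""
    score = 0.0
    for term in query:
        if term in doc:
            score += doc[term]
        else:
            score += max(
                (relatedness.get((term, other), 0.0) * weight
                 for other, weight in doc.items()),
                default=0.0,
            )
    return score


def rank(query, docs, relatedness):
    """Exact-match hits first; relatedness-only hits are appended below them."""
    exact = {name: exact_score(query, doc) for name, doc in docs.items()}
    hits = sorted((n for n in docs if exact[n] > 0), key=lambda n: -exact[n])
    fallback = {n: related_score(query, docs[n], relatedness)
                for n in docs if exact[n] == 0}
    low_rank = sorted((n for n in fallback if fallback[n] > 0),
                      key=lambda n: -fallback[n])
    return hits + low_rank


# Hypothetical index and relatedness table.
docs = {
    "parser.jar": {"parse": 0.8, "xml": 0.5},
    "reader.jar": {"read": 0.9, "document": 0.4},
    "math.jar": {"matrix": 1.0},
}
relatedness = {("parse", "read"): 0.6, ("xml", "document"): 0.7}
ranking = rank(["parse", "xml"], docs, relatedness)
```

    Keeping the two groups separate means a relatedness-only document can never outrank a document that matches the query exactly, regardless of how the two scores compare numerically.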

    The Use of Python Tutor on Programming Laboratory Session: Student Perspectives

    Based on the fact that the impact of educational tools can only be accurately measured through student-centered evaluation, this paper proposes a long-term in-class evaluation of Python Tutor, a program visualization tool developed by Guo. The evaluation involves 53 students from 4 Basic Data Structure classes, which were held in the even semester of the 2016/2017 academic year. It is conducted through a questionnaire survey given to the students after they have used Python Tutor in half of their programming laboratory sessions. In general, there are three findings from this work. Firstly, Python Tutor helps students complete programming laboratory tasks, specifically for Basic Data Structure material. Secondly, Python Tutor helps students understand general programming aspects, namely execution flow, variable content change, method invocation sequence, object reference, syntax error, and logic error. Finally, based on student perspectives, Python Tutor is a helpful tool that positively affects the students.

    Complexitor: an Educational Tool for Learning Algorithm Time Complexity in Practical Manner

    Based on an informal survey, algorithm time complexity can be rather difficult to understand when learned in a theoretical manner. Therefore, this research proposed Complexitor, an educational tool for learning algorithm time complexity in a practical manner. Students could learn how to determine algorithm time complexity through the actual execution of an algorithm implementation. They were only required to provide the algorithm implementation (i.e. source code written in a particular programming language) and test cases. After the input was given, Complexitor generated an execution sequence based on the test cases and determined its time complexity through Pearson correlation. The time complexity with the highest correlation value toward the execution sequence was assigned as the result. Based on the evaluation, it can be concluded that this mechanism is quite effective for determining time complexity as long as the distribution of the given input set is balanced.
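
    The correlation step can be illustrated with a short sketch. It assumes the execution sequence is a list of step counts measured at several input sizes and that the candidates are a small fixed set of growth functions; the candidate set, the function names, and the sample algorithm are illustrative assumptions, not Complexitor's actual design.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long numeric sequences."""
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Assumed candidate growth functions (a constant one is omitted because its
# variance is zero, which leaves the correlation undefined).
CANDIDATES = {
    "O(log n)": lambda n: math.log(n),
    "O(n)": lambda n: float(n),
    "O(n log n)": lambda n: n * math.log(n),
    "O(n^2)": lambda n: float(n ** 2),
    "O(n^3)": lambda n: float(n ** 3),
}


def estimate_complexity(sizes, steps):
    """Return the candidate whose values correlate most with the step counts."""
    scores = {name: pearson([f(n) for n in sizes], steps)
              for name, f in CANDIDATES.items()}
    return max(scores, key=scores.get), scores


def count_pair_comparisons(n):
    """Sample algorithm: count the steps of comparing every pair of n items."""
    steps = 0
    for i in range(n):
        for _ in range(i + 1, n):
            steps += 1
    return steps


sizes = [10, 20, 40, 80, 160]
steps = [count_pair_comparisons(n) for n in sizes]
best, scores = estimate_complexity(sizes, steps)
```

    Several candidates often correlate strongly with the same measurements, so the chosen input sizes need to spread out enough for the best match to stand apart, which is consistent with the abstract's remark about a balanced input set.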

    AP-ASD1: an Indonesian Desktop-based Educational Tool for Basic Data Structure Course

    Although many data structure educational tools are available, it is quite difficult to find a suitable tool to help students learn a certain course [1]. Several major impediments in selecting a tool are teaching preferences, language barriers, confusing terminology, internet dependency, varying degrees of material difficulty, and other environmental aspects. In this research, a data structure educational tool called AP-ASD1 is developed based on the basic algorithm and data structure course (ASD 1). Since AP-ASD1 is developed following the course materials and not vice versa, this educational tool is guaranteed to fit our needs. The feasibility of AP-ASD1 is evaluated based on two factors, namely functional correctness and a survey. All features function correctly and yield the expected output, whereas the survey yields a fairly good result (84.305% achievement rate). Based on our survey, AP-ASD1 meets the eligibility standard and its features are successfully integrated. The survey also concludes that this application is quite effective as a supportive tool for learning basic data structures.